A Hybrid TTS Approach for Prosody and Acoustic Modules
Abstract
Unit selection (US) TTS systems generate quite natural speech, but its quality is highly variable. Statistical parametric (SP) systems offer far more consistent quality but reduced naturalness because they rely on vocoding. We present a hybrid approach (HA) that aims to improve overall naturalness by combining both synthesis methods. Unlike other works, the methods are fused in both the prosody and acoustic modules, yielding more robust prosody prediction and greater naturalness. Objective and subjective experiments show the validity of our procedure.
Similar resources
An Adaptable Acoustic Architecture in a Multilingual TTS System
In this paper an adaptable acoustical architecture in a multilingual TTS system is presented. The whole architecture is designed to be a data-driven system. Modules comprising text preprocessing, grapheme-to-phoneme conversion, lexical stress detection, OOV-handling, symbolic prosody prediction, acoustic prosody prediction and unit selection with concatenation use machine learning techniques es...
Modular Design for Mandarin Text-to-speech Synthesis
In the European Union funded project Technology and Corpora for Speech-to-Speech Translation (TC-STAR) [3], we have developed a modular concatenative TTS system for Mandarin Chinese. A common architecture has been introduced based on well-defined modules and interfaces. Three main modules, text processing, prosody processing and acoustic synthesis modules, are used following a commonly employed...
A new Japanese TTS system based on speech-prosody database and speech modification
This paper describes a new Japanese text-to-speech (TTS) system that can produce highly natural and intelligible synthetic speech. The good performance of the new TTS system derives from three new sophisticated approaches, as follows: (1) a new prosody control algorithm that uses prosody data extracted from a natural speech database and a duration control algorithm based on statistical estimation...
Modular Text-to-Speech Synthesis Evaluation for Mandarin Chinese
Proper evaluation can efficiently drive the development of text-to-speech (TTS) systems. Assessment is needed to determine how well a system or technique compares to others, or how it compares with the previous version of the system. To obtain more useful feedback for development, we evaluate not only the whole system but also each module of the TTS system separately. Based on...
The Papageno TTS System
One activity of Siemens in the TC-STAR project is to develop a high-quality text-to-speech (TTS) system for UK English. Our main focus is the improvement of the text preprocessing and the acoustic synthesis. Therefore in the second evaluation we took part with the text preprocessing module (task M1) and the whole system (tasks S1 and S2) for UK English. In this article the three modules text pr...